Density-based Clustering

Density-based methods group data points by how densely they are packed in a neighborhood, rather than by their distance to a cluster centroid.

How it works

  1. Define two hyperparameters:
    • a distance parameter ϵ = the maximum distance at which two examples count as neighbors
    • a quantity parameter n = the minimum number of ϵ-neighbors an example must have for them to be put in a cluster
  2. Pick an example x from your dataset at random and assign it to cluster 1, then count how many examples lie within distance ϵ of x (its ϵ-neighbors).
    • If this count is greater than or equal to n, put all of these ϵ-neighbors in cluster 1 as well.
    • Examine each member of cluster 1 and find its ϵ-neighbors. If a member has n or more ϵ-neighbors, expand cluster 1 by adding those ϵ-neighbors to it.
    • Continue expanding cluster 1 until there are no more examples to put in it.
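
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. The function names (density_cluster, region_query) are made up for this note, and the sketch rests on three assumptions the steps do not spell out: the neighbor count includes x itself, an example with fewer than n ϵ-neighbors is labeled -1 (noise) unless a later cluster absorbs it, and once a cluster stops growing the next unassigned example starts cluster 2, and so on.

```python
import math
import random


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def density_cluster(points, eps, n, seed=0):
    """Return one label per point: a cluster id (1, 2, ...) or -1 for noise."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # "pick an example at random"

    labels = [None] * len(points)  # None means "not examined yet"
    cluster = 0
    for i in order:
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < n:
            labels[i] = -1  # assumption: too sparse, provisionally noise
            continue
        cluster += 1  # start a new cluster at this example
        labels[i] = cluster
        queue = list(neighbors)
        while queue:  # keep expanding until nothing is left to add
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border member
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= n:  # dense member: pull in its neighbors too
                queue.extend(j_neighbors)
    return labels


# Two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = density_cluster(points, eps=1.5, n=3)
print(labels)
```

With ϵ = 1.5 and n = 3, the first four points share one label, the next four share another, and the isolated point is labeled -1. Which group gets id 1 and which gets id 2 depends on the random visiting order.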

It uses a parameter called the minimum cluster size (MinClusterSize) in addition to ϵ and MinPts (the distance parameter and the quantity parameter n defined above).

Examples